On the reliability of information retrieval metrics based on graded relevance

Author

Tetsuya Sakai

Abstract

This paper compares 14 information retrieval metrics based on graded relevance, together with 10 traditional metrics based on binary relevance, in terms of stability, sensitivity and resemblance of system rankings. More specifically, we compare these metrics using the Buckley/Voorhees stability method, the Voorhees/Buckley swap method and Kendall's rank correlation, with three data sets comprising test collections and submitted runs from NTCIR. Our experiments show that (Average) Normalised Discounted Cumulative Gain at document cut-off l is the best among the rank-based graded-relevance metrics, provided that l is large. On the other hand, if one requires a recall-based graded-relevance metric that is highly correlated with Average Precision, then Q-measure is the best choice. Moreover, these best graded-relevance metrics are at least as stable and sensitive as Average Precision, and are fairly robust to the choice of gain values. © 2006 Elsevier Ltd. All rights reserved.
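To make two of the quantities named in the abstract concrete, the following is a minimal sketch of Normalised Discounted Cumulative Gain at cut-off l and of Kendall's rank correlation between two sets of system scores. It is an illustration under stated assumptions, not the paper's exact formulation: the `log2(rank + 1)` discount is the commonly used variant and may differ from the one used in the paper, the gain values (3, 2, 1, 0) are hypothetical, and the function names are chosen here for illustration only.

```python
import math
from itertools import combinations


def dcg(gains, l):
    """Discounted cumulative gain over the top-l gains.

    Assumption: rank r is discounted by log2(r + 1); the paper may use a
    different discount function.
    """
    return sum(g / math.log2(r + 1) for r, g in enumerate(gains[:l], start=1))


def ndcg(ranked_gains, all_gains, l):
    """nDCG at cut-off l: system DCG divided by the DCG of an ideal ranking.

    ranked_gains: gain of each retrieved document in rank order (0 if non-relevant).
    all_gains: gains of all relevant documents for the topic.
    """
    ideal = sorted(all_gains, reverse=True)
    ideal_dcg = dcg(ideal, l)
    return dcg(ranked_gains, l) / ideal_dcg if ideal_dcg > 0 else 0.0


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score lists over the same systems.

    Assumes at least two systems; tied pairs count as neither concordant
    nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(scores_a)), 2):
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(scores_a) * (len(scores_a) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical gain values: 3 = highly relevant, 2 = relevant, 1 = partially relevant.
perfect = ndcg([3, 2, 1], [3, 2, 1], l=3)
reversed_run = ndcg([1, 2, 3], [3, 2, 1], l=3)
```

Here `perfect` equals 1.0 because the run matches the ideal ordering, while `reversed_run` is strictly smaller because the highly relevant document arrives late. Applying `kendall_tau` to the per-system mean scores of two metrics gives a rough measure of how similarly they rank the systems.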

Similar resources

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research in the field of search engine evaluation. So, the purpose of this study was to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...

The Role of the FUM Students' Demographic Features in the Relevance Judgment Scores of Their Information Retrieval Results in Search Engines

In order to design user-friendly information retrieval systems, it is important to pay attention to characteristics of users. Therefore, the aim of the present study is to investigate the role of demographic variables of users during their search in search engines. Method: This is an applied study in terms of purpose, which was done by the evaluation method. To conduct the research, firstly,...

On Penalising Late Arrival of Relevant Documents in Information Retrieval Evaluation with Graded Relevance

Large-scale information retrieval evaluation efforts such as TREC and NTCIR have tended to adhere to binary-relevance evaluation metrics, even when graded relevance data were available. However, the NTCIR-6 Crosslingual Task has finally started adopting graded-relevance metrics, though only as additional metrics. This paper compares three existing graded-relevance metrics that were mentioned in...

Controlling the Penalty on Late Arrival of Relevant Documents in Information Retrieval Evaluation with Graded Relevance

Large-scale information retrieval evaluation efforts such as TREC and NTCIR have always used binary-relevance evaluation metrics, even when graded relevance data were available. However, the NTCIR-6 crosslingual task has finally announced that it will use graded-relevance metrics, though only as additional metrics. This paper compares graded-relevance metrics in terms of the ability to control ...

Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback

Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach for document image retrieval based on keyword spotting is proposed. In the proposed method, a framework using relevance feedback is presented. Relevance feedback, an interactive and efficient method, is used in this paper to imp...

Journal title:

Inf. Process. Manage.

Volume 43

Publication date: 2007
